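This work concerns recent developments in speech recognition. It analyzes isolated digit recognition in the presence of different bit rates and different noise levels. The experiments were carried out using Audacity and the HTK toolkit, with the Hidden Markov Model (HMM) as the recognition model. The feature extraction techniques used are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC), Perceptual Linear Prediction (PLP), Mel Spectrum (MELSPEC), and Filter Bank (FBANK). Three different noise conditions were considered for testing the data: random noise, fan noise, and random noise in a real-time environment. This was done to determine which environment is best suited to real-time applications. In addition, five commonly used bit rates at different sampling rates were considered in order to find the best bit rate.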
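As a rough illustration of what the mel-based features above (MFCC, MELSPEC, FBANK) have in common, the sketch below places filter-bank centre frequencies evenly on the mel scale, using the commonly cited mapping mel = 2595 · log10(1 + f/700). The function names, the 26-filter count, and the 0 to 4000 Hz range are assumptions chosen for illustration. They are not taken from the abstract, and this is not the HTK implementation.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the mel scale (common 2595/700 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def filter_centres(num_filters, low_hz, high_hz):
    """Centre frequencies (Hz) of filters spaced evenly on the mel scale.

    Hypothetical helper: the band edges and filter count are illustrative
    assumptions, not values from the study.
    """
    lo, hi = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (hi - lo) / (num_filters + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(num_filters)]


# Assumed configuration: 26 filters covering 0 to 4000 Hz.
centres = filter_centres(num_filters=26, low_hz=0.0, high_hz=4000.0)
```

Because the spacing is uniform in mel rather than in Hz, the centres crowd together at low frequencies and spread out at high frequencies, which is why a lower sampling rate removes comparatively few filters.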
Anomaly analytics is a popular and vital task in various research contexts and has been studied for several decades. At the same time, deep learning has shown its capacity for solving many graph-based tasks, such as node classification, link prediction, and graph classification. Recently, many studies have extended graph learning models to anomaly analytics problems, resulting in beneficial advances in graph-based anomaly analytics techniques. In this survey, we provide a comprehensive overview of graph learning methods for anomaly analytics tasks. We classify them into four categories based on their model architectures, namely graph convolutional network (GCN), graph attention network (GAT), graph autoencoder (GAE), and other graph learning models. The differences between these methods are also compared in a systematic manner. Furthermore, we outline several graph-based anomaly analytics applications across various real-world domains. Finally, we discuss five potential future research directions in this rapidly growing field.
We present a robust methodology for evaluating biases in natural language generation (NLG) systems. Previous works use fixed, hand-crafted prefix templates that mention various demographic groups to prompt models to generate continuations for bias analysis. These fixed prefix templates may themselves be specific in style or linguistic structure, which can lead to unreliable fairness conclusions that are not representative of the general trends across tone-varying prompts. To study this problem, we paraphrase the prompts with different syntactic structures and use these to evaluate demographic bias in NLG systems. Our results suggest similar overall bias trends, but some syntactic structures lead to conclusions that contradict past works. We show that our methodology is more robust and that some syntactic structures prompt more toxic content while others can prompt less biased generation. This suggests the importance of not relying on a fixed syntactic structure and of using tone-invariant prompts. Introducing syntactically diverse prompts can achieve more robust NLG (bias) evaluation.
Learning rich skills through temporal abstractions without supervision of external rewards is at the frontier of Reinforcement Learning research. Existing works mainly fall into two distinct categories: variational and Laplacian-based option discovery. The former maximizes the diversity of the discovered options through a mutual information loss but overlooks coverage of the state space, while the latter focuses on improving the coverage of options by increasing connectivity during exploration but does not consider diversity. In this paper, we propose a unified framework that quantifies diversity and coverage through a novel use of the Determinantal Point Process (DPP) and enables unsupervised option discovery that explicitly optimizes both objectives. Specifically, we define the DPP kernel matrix with the Laplacian spectrum of the state transition graph and use the expected mode number in the trajectories as the objective to capture and enhance both diversity and coverage of the learned options. The proposed option discovery algorithm is extensively evaluated on challenging tasks built with MuJoCo and Atari, demonstrating that it substantially outperforms SOTA baselines from both the diversity-driven and coverage-driven categories. The code is available at https://github.com/LucasCJYSDL/ODPP.
Object-goal navigation (Object-nav) entails searching for, recognizing, and navigating to a target object. Object-nav has been extensively studied by the Embodied-AI community, but most solutions are restricted to static objects (e.g., television, fridge). We propose a modular framework for object-nav that can efficiently search indoor environments not just for static objects but also for movable objects (e.g., fruits, glasses, phones) that frequently change position due to human intervention. Our contextual-bandit agent efficiently explores the environment by showing optimism in the face of uncertainty and learns a model of the likelihood of spotting different objects from each navigable location. The likelihoods are used as rewards in a weighted minimum latency solver to deduce a trajectory for the robot. We evaluate our algorithms in two simulated environments and a real-world setting to demonstrate high sample efficiency and reliability.
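To make "optimism in the face of uncertainty" concrete, here is a minimal sketch of the idea using plain UCB1 over a handful of locations. This is a deliberate simplification: it is a non-contextual bandit, not the contextual-bandit agent described in the abstract. The spotting probabilities, the number of rounds, and all function names are assumptions made up for illustration.

```python
import math
import random


def ucb1_select(counts, successes, t):
    """Pick the location with the highest optimistic estimate (UCB1)."""
    for i, n in enumerate(counts):
        if n == 0:
            return i  # visit every location once before comparing bounds
    return max(
        range(len(counts)),
        key=lambda i: successes[i] / counts[i]
        + math.sqrt(2.0 * math.log(t) / counts[i]),
    )


def run(spot_probs, rounds, seed=0):
    """Simulate repeated visits; spot_probs are hypothetical ground truth."""
    rng = random.Random(seed)
    counts = [0] * len(spot_probs)
    successes = [0] * len(spot_probs)
    for t in range(1, rounds + 1):
        i = ucb1_select(counts, successes, t)
        counts[i] += 1
        if rng.random() < spot_probs[i]:
            successes[i] += 1  # the object was spotted from location i
    return counts, successes


# Assumed toy setting: three locations with different spotting probabilities.
counts, successes = run([0.1, 0.2, 0.8], rounds=2000)
estimates = [s / n for s, n in zip(successes, counts)]
```

The exploration bonus shrinks as a location is visited more often, so rarely visited locations keep getting occasional visits while most of the budget goes to the location where the object is most often seen. The resulting per-location estimates play the role of the learned spotting likelihoods that the abstract feeds into its trajectory solver.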
Time series anomaly detection has applications in a wide range of research fields, including manufacturing and healthcare. The presence of anomalies can indicate novel or unexpected events, such as production faults, system defects, or heart fluttering, and is therefore of particular interest. The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns. This survey provides a structured and comprehensive overview of state-of-the-art deep learning models for time series anomaly detection. It provides a taxonomy based on the factors that divide anomaly detection models into different categories. Aside from describing the basic anomaly detection technique for each category, it also discusses their advantages and limitations. Furthermore, this study includes examples of deep anomaly detection in time series across various application domains in recent years. Finally, it summarises open research issues and the challenges faced when adopting deep anomaly detection models.
Substitute-based recommendation is widely used in E-commerce to provide better alternatives to customers. However, existing research typically uses customer behavior signals such as co-view and view-but-purchase-another to capture the substitute relationship. Although this approach is intuitively sound, we find that it can ignore the functionality and characteristics of products. In this paper, we recast substitute recommendation as a language matching problem, taking the product title description as model input in order to account for product functionality. We design a new transformation method to de-noise the signals derived from production data. In addition, we consider multilingual support from an engineering point of view. Our proposed end-to-end transformer-based model succeeds in both offline and online experiments. The model has been deployed on a large-scale E-commerce website for 11 marketplaces in 6 languages, and an online A/B experiment shows that it increases revenue by 19%.
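We study the performance of CLAIRE, a diffeomorphic multi-node, multi-GPU image-registration algorithm and software, in large-scale biomedical imaging applications with billions of voxels. At such resolutions, most existing software packages for diffeomorphic image registration are prohibitively expensive. As a result, practitioners first significantly downsample the original images and then register them using existing tools. Our main contribution is an extensive analysis of the impact of downsampling on registration performance. We study this impact by comparing full-resolution registrations obtained with CLAIRE to lower-resolution registrations for synthetic and real-world imaging datasets. Our results suggest that registration at full resolution can yield superior registration quality, but not always. For example, downsampling a synthetic image from $1024^3$ to $256^3$ decreases the Dice coefficient from 92% to 79%. However, the differences are less pronounced for noisy or low-contrast high-resolution images. CLAIRE allows us not only to register images of clinically relevant size in a few seconds but also to register images at unprecedented resolution in reasonable time. The highest resolution considered is CLARITY images of size $2816\times3016\times1162$. To the best of our knowledge, this is the first study of image registration quality at such resolutions.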
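The Dice coefficient quoted in the abstract above measures the overlap between two segmentations: Dice = 2|A ∩ B| / (|A| + |B|). The sketch below computes it for two flattened binary masks in plain Python. The function name, the toy masks, and the convention for two empty masks are illustrative assumptions. This is not CLAIRE's evaluation code.

```python
from typing import Sequence


def dice_coefficient(a: Sequence[int], b: Sequence[int]) -> float:
    """Dice overlap of two binary masks given as flat 0/1 sequences."""
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for v in a if v)
    size_b = sum(1 for v in b if v)
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)


# Toy example (made-up labels): reference mask vs. mask after registration.
reference = [0, 1, 1, 1, 0, 0, 1, 0]
warped = [0, 1, 1, 0, 0, 1, 1, 0]
score = dice_coefficient(reference, warped)
```

Here each mask has four foreground voxels and three of them coincide, so the score is 2 · 3 / (4 + 4) = 0.75. A value of 1.0 means the registered labels overlap the reference exactly.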
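Mean-field control (MFC) has recently been shown to be a scalable tool for approximately solving large-scale multi-agent reinforcement learning (MARL) problems. However, these studies are typically limited to an unconstrained cumulative reward maximization framework. In this paper, we show that the MFC approach can be used to approximate the MARL problem even in the presence of constraints. Specifically, we prove that an $N$-agent constrained MARL problem, in which the state and action spaces of each individual agent have sizes $|\mathcal{X}|$ and $|\mathcal{U}|$ respectively, can be approximated by an associated constrained MFC problem with an error $e \triangleq \mathcal{O}\left(\left[\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}\right]/\sqrt{N}\right)$. In the special case where the reward, cost, and state transition functions are independent of the action distribution of the population, we prove that the error can be improved to $e = \mathcal{O}\left(\sqrt{|\mathcal{X}|}/\sqrt{N}\right)$. In addition, we provide a natural policy gradient based algorithm and prove that it can solve the constrained MARL problem within an error of $\mathcal{O}(e)$ with a sample complexity of $\mathcal{O}(e^{-6})$.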
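Active consumption of digital documents has opened up scope for research in various applications, including search. Traditionally, search within a document has been cast as a text matching problem, ignoring the rich layout and visual cues commonly present in structured documents, forms, and the like. To that end, we ask a mostly unexplored question: "Can we search for other similar snippets present in a target document page given a single query instance of a document snippet?". We propose MONOMER to solve this as a one-shot snippet detection task. MONOMER fuses context from the visual, textual, and spatial modalities of snippets and documents to find the query snippet in target documents. We conduct extensive ablations and experiments showing that MONOMER outperforms several baselines from one-shot object detection (BHRL), template matching, and document understanding (LayoutLMv3). Because relevant data for this task is scarce, we train MONOMER on programmatically generated data containing many visually similar pairs of query snippets and target documents, drawn from two datasets: Flamingo Forms and PubLayNet. We also conduct a human study to validate the generated data.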